Climate Informatics: Accelerating
Discovering in Climate Science with 
Machine Learning 

The goal of climate informatics, an emerging discipline, is to inspire collaboration between
climate scientists and data scientists, in order to develop tools to analyze complex and
ever-growing amounts of observed and simulated climate data, and thereby bridge the gap
between data and understanding. Here, recent climate informatics work is discussed along
with some of the field's remaining challenges.


The impacts of present and potential
future climate change pose impor- 
tant scientific and societal chal- 
lenges. Scientists have observed 
changes in temperature, sea ice, and sea level, 
and attributed those changes to human activity. 
It is an urgent international priority to improve 
our understanding of the climate system — a 
system characterized by complex phenomena 
that are difficult to observe and even more 
difficult to simulate. Despite the increasing 
availability of computational resources, cur- 
rent analytical tools have been outpaced by 
the ever-growing amounts of observed climate 
data from satellites, environmental sensors, and 
climate-model simulations. Computational ap- 
proaches will therefore be indispensable for 
these analysis challenges. The goal of the fledgling
research discipline, climate informatics, is to
inspire collaboration between climate scientists
and data scientists (machine learning, statistics, 
and data mining researchers), and thus bridge 
the gap between data and understanding. Re- 
search on climate informatics will accelerate 
discovery and answer pressing questions in cli- 
mate science. 

Machine learning is an active research area at 
the interface of computer science and statistics. 
The goal of machine learning research is to develop
algorithms, automated techniques, to detect
patterns in data. Such algorithms are critical to a 
range of technologies including Web search, rec- 
ommendation systems, personalized Internet ad- 
vertising, computer vision, and natural language 
processing. Machine learning also benefits the 
natural sciences, such as biology; the interdisci- 
plinary bioinformatics field has facilitated many 
discoveries in genomics and proteomics. The im- 
pact of machine learning on climate science has 
the potential to be similarly profound. 

Here, we focus specifically on challenges in 
climate modeling; however, there are myriad 
collaborations possible at the intersection of 
these two fields. Recent work reveals that col- 
laborations with climate scientists also generate 
interesting new problems for machine learning. 1
To broaden the discussion, we propose challenge
problems for climate informatics, some
of which we discuss in detail elsewhere, 2 and
review recent successes in climate informatics. 
Climate scientists and machine learning, data 
mining, and statistics researchers discuss these 
topics at the Climate Informatics Workshop (an 
annual event we launched in 2011). Our prior 
work, including a survey, provides further dis- 
cussion of related work on climate informat- 
ics. 1-3 Additional resources are available on the 
climate informatics wiki (http://sites.google.com/site/lstclimateinformatics).

Climate Modeling 

Climate scientists use climate models, large-scale
mathematical models run as computer simula- 
tions, to understand and predict the climate. 
Geophysical experts, including climate scientists 
and meteorologists, encode data from observed 
processes into highly complex, nonlinear math- 
ematical models. As shown in Figure 1, each gen- 
eral circulation model (GCM) includes numerous 
individual climate-process models, such as cloud 
formation, rainfall, wind, ocean currents, and 
radiative heat transfer through the atmosphere. 
Results emergent from the models, such as the 
sensitivity of the climate to increasing green- 
house gases, are crucial to researchers trying to 
project and forecast the Earth’s climate. 4 (It is im- 
portant to note that whereas in machine learning, 
data mining, and statistics, the term model typi- 
cally means data-driven, in climate modeling this 
term refers to a system of mathematical models 
that are based on scientific first principles). 

Global climate modeling efforts began in the 
1970s. The models have become more complex 
as computational resources have evolved. Cur- 
rently, about 25 laboratories across the world 
support almost 50 climate models, forming the 
basis of climate projections (predictions) as- 
sessed by the Intergovernmental Panel on Cli- 
mate Change (IPCC), which was established by 
the United Nations in 1988 and received the 
2007 Nobel Peace Prize (shared with former 
Vice President Al Gore). Climate scientists initially
developed the Coupled Model Intercomparison
Project version 3 (CMIP3) archive to
support the IPCC Fourth Assessment Report. 5 
Researchers have used the archive in more than 
500 publications, and it’s a rich source of cli- 
mate simulation output. The CMIP5 project 
continues the tradition of making global cli- 
mate-model predictions easier to use, and it will 
be quite significant in future IPCC reports. 

Figure 1. A global climate model discretization, and a selection of
included physical processes. The US National Oceanic and Atmospheric
Administration (NOAA) website offers this information and much more
concerning climate change (http://celebrating200years.noaa.gov/breakthroughs/climate_model/welcome.html).

The multi-model ensemble (MME), the ensemble
of climate models that informs the
IPCC, has high variance over model predictions,
for a variety of reasons. Different teams
of scientists designed each model based on 
scientific first principles, which led to differ- 
ences in scope, discretization assumptions, 
the included science, and coding errors. Even 
though the different models’ predictions vary 
greatly, some climate scientists observed that 
overall, the average prediction (over multiple 
quantities, performance metrics, and time pe- 
riods) is more consistent than any one model. 6,7 
There has been growing interest, in the cli- 
mate-modeling community, in better ways to 
combine MME predictions, as well as methods 
to assess the “skill” of a single climate model 
(quantifying the model’s accuracy over a naive 
prediction). Researchers attempting to rank or 
weight models must show that the choices are 
meaningful for the specific context. One ap- 
proach supported by the climate science com- 
munity is the perfect model assumption. In this 
framework, researchers assume one model to 
be the “truth,” then, over a calibration interval, 
they evaluate prediction methods trained on 
simulated observations generated by the “true” 
model. Scientists discussed these issues at an 
IPCC Expert Meeting on Assessing and Com- 
bining Multi-Model Climate Projections. 8,9 
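
As a rough illustration of these two ideas, the following Python sketch computes a skill score and compares the ensemble mean with individual models. The skill definition used here (one minus the ratio of the model's mean squared error to that of a naive reference prediction) is one common convention and is an assumption, not necessarily the metric used in the cited studies. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def mse(pred, obs):
    """Mean squared error between a prediction series and observations."""
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    return float(np.mean((pred - obs) ** 2))

def skill_score(pred, obs, naive):
    """Skill relative to a naive prediction (assumed definition):
    1 is perfect, 0 matches the naive prediction, negative is worse."""
    return 1.0 - mse(pred, obs) / mse(naive, obs)

# Synthetic example: three "models" that each miss the truth differently.
rng = np.random.default_rng(0)
years = np.arange(100)
obs = 0.01 * years                              # slowly warming "truth"
models = np.stack([obs + rng.normal(0, 0.3, 100) + bias
                   for bias in (-0.2, 0.0, 0.2)])
naive = np.full(100, obs.mean())                # naive prediction: long-term mean

ensemble_mean = models.mean(axis=0)
model_skills = [skill_score(m, obs, naive) for m in models]
mean_skill = skill_score(ensemble_mean, obs, naive)
```

In this artificial setup the models' errors are independent, so averaging cancels much of the noise and the ensemble mean scores higher than any single member. Real climate models share components and biases, so the benefit of averaging may be smaller in practice.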

Tracking Climate Models

Our research applies machine learning al- 
gorithms to the problem of tracking the 
multi-model ensemble. 1,3,9 Our results on tem- 
perature data (observed and predicted tempera- 
ture anomalies averaged over global, regional, 
annual, and monthly scales) show that our algo- 
rithm produces predictions that nearly match, 
and sometimes surpass, the results of the best 
model for the entire observation sequence. This 
is significant, because only in hindsight can one 
determine the best model for the whole obser- 
vation sequence. We used online learning algo- 
rithms with the goal of making both real-time 
and future predictions. Moreover, our research 
shows that the naive “batch” approach has dis- 
advantages due to the nonstationary nature of 
the observations and the relatively short history 
of model prediction data. 

Climate scientists use temperature anoma- 
lies to express both the climate-model predic- 
tions and the true observations. A temperature 
anomaly is the difference between the observed 
temperature and the temperature at the same 
location at a fixed, benchmark time. To put it 
another way, anomalies are measurements of 
temperature change. Climate scientists use 
temperature anomalies because, while tem- 
peratures vary widely over geographical loca- 
tion, temperature anomalies typically vary less. 
For example, in a particular month it might be 
80°F in New York, and 70°F in Toronto, but 
the anomaly from the benchmark time might 
be 1°F in both places. Thus, variance is lower 
when researchers average temperature anoma- 
lies over many geographic locations, than when 
they use absolute temperatures. Figure 2 shows 
climate model simulation runs, and observation 
data, averaged over many geographical loca- 
tions, and many times in a year, yielding one 
value for a global mean temperature anomaly 
per year. In this case, researchers baselined the 
benchmark over the period 1951-1980 (one can 
convert between benchmark eras by subtract- 
ing a constant). The figure shows the climate 
model predictions we used as input to our glob- 
al annual experiments, where the thick red line 
is the mean prediction over all models, in both 
plots. The thick blue line indicates the true 
observations. 
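
The anomaly calculation described above can be sketched in a few lines of Python. This is a minimal illustration on made-up data (two locations with different absolute temperatures but the same trend); the function name, array layout, and numbers are assumptions chosen for the example, not the processing used to produce Figure 2.

```python
import numpy as np

def anomalies(temps, years, base_start=1951, base_end=1980):
    """Subtract each location's mean over the benchmark period.

    temps: array of shape (n_years, n_locations)
    years: array of shape (n_years,)
    """
    temps = np.asarray(temps, float)
    years = np.asarray(years)
    in_base = (years >= base_start) & (years <= base_end)
    baseline = temps[in_base].mean(axis=0)      # one benchmark value per location
    return temps - baseline

# Hypothetical data: two locations, different climates, same warming trend.
years = np.arange(1900, 2010)
trend = 0.01 * (years - 1900)
temps = np.column_stack([80.0 + trend, 70.0 + trend])

anom = anomalies(temps, years)
global_mean_anomaly = anom.mean(axis=1)         # one value per year

# Changing the benchmark era shifts each location's series by a constant.
anom_alt = anomalies(temps, years, 1961, 1990)
shift = anom[:, 0] - anom_alt[:, 0]
```

Although the two locations differ by 10 degrees in absolute terms, their anomaly series coincide, which is why averaging anomalies over locations has lower variance than averaging absolute temperatures.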

We obtained our results on Tracking Climate
Models (TCM) 1 by applying the Learn-α
algorithm, 10 which tracks a shifting sequence
of temperature values with respect to “expert”
predictions, which we used to represent the
climate models. 1 In our previous work, 10 bringing
a view from probabilistic graphical models
to bear on traditional algorithms for “online 
learning with expert advice,” 11,12 we re-derived 
such algorithms as Bayesian updates of a Hid- 
den Markov Model (HMM), in which the 
identity of the current best expert is the hidden 
variable. This allowed us to introduce an algo- 
rithm that learns the switching rate between 
best experts, while simultaneously performing 
the original prediction task. 10 When we apply 
the Learn-α algorithm to the climate-model
setting, the algorithm learns hierarchically, 
based on a set of generalized HMMs, where 
the hidden variable is the current best climate 
model’s identity. 
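
To make the hierarchical structure concrete, here is a simplified Python sketch in the spirit of Learn-α. It is a reading of the description above, not the published algorithm, and it rests on several assumptions: squared error as the per-expert loss, a transition model that switches uniformly among the other experts, a small fixed grid of candidate switching rates, and a top-level loss equal to the negative log of each sub-algorithm's expected exponentiated loss. The function names and synthetic data are hypothetical; reference 10 gives the actual definition.

```python
import numpy as np

def fixed_share_step(p, losses, alpha):
    """One HMM-style update of the distribution over experts."""
    post = p * np.exp(-losses)                  # Bayesian update on observed losses
    post /= post.sum()
    n = len(p)
    # Transition: keep the same best expert with probability 1 - alpha,
    # otherwise switch uniformly to one of the other experts.
    return (1 - alpha) * post + alpha * (1 - post) / (n - 1)

def learn_alpha(expert_preds, observations, alphas):
    """Track observations with experts while learning the switching rate."""
    T, n = expert_preds.shape
    m = len(alphas)
    p = np.full((m, n), 1.0 / n)                # one expert distribution per alpha
    top = np.full(m, 1.0 / m)                   # distribution over alpha values
    preds = np.empty(T)
    for t in range(T):
        sub_preds = p @ expert_preds[t]         # each sub-algorithm's prediction
        preds[t] = top @ sub_preds              # combined prediction
        losses = (expert_preds[t] - observations[t]) ** 2
        sub_losses = -np.log(p @ np.exp(-losses))
        top = top * np.exp(-sub_losses)         # update weights over alphas
        top /= top.sum()
        for j, a in enumerate(alphas):
            p[j] = fixed_share_step(p[j], losses, a)
    return preds

# Synthetic check: the best "model" switches halfway through the sequence.
rng = np.random.default_rng(0)
T = 200
offsets = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
expert_preds = offsets + rng.normal(0, 0.05, (T, 5))
best = np.where(np.arange(T) < 100, 0, 4)
observations = offsets[best]

preds = learn_alpha(expert_preds, observations, alphas=[0.001, 0.01, 0.1])
mse_learner = float(np.mean((preds - observations) ** 2))
mse_mean = float(np.mean((expert_preds.mean(axis=1) - observations) ** 2))
```

On this toy sequence the learner's error falls well below that of the plain ensemble average, because it concentrates weight on the current best expert and recovers quickly after the switch.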

We ran experiments on NASA’s historical
temperature data (http://data.giss.nasa.gov/gistemp),
averaged annually and globally,
from 1900 through 2009, as well as the cor- 
responding predictions of 20 different climate 
models per year (from the CMIP3 archive at 
www-pcmdi.llnl.gov/ipcc/about_ipcc.php). 
These model simulations started from an ap- 
proximately stable climate in the 19th centu- 
ry, and were stepped forward using estimates 
of changes in the external drivers of climate 
change (greenhouse gases, volcanoes, atmo- 
spheric particulates, land-use changes, and so 
on). However, the model dynamics self-generate
the month-to-month and year-to-year
variability. The GCM output wasn’t informed
by observations; therefore, it’s valid to run historical
experiments using the GCM ensemble
predictively on historical data. We also ran 
experiments using climate model predictions 
through the year 2098, in order to harness the 
climate models’ future predictions. Of course, 
there’s no observation data in the future with 
which we could evaluate the machine learning 
algorithms. So, to achieve this goal, we ran fu- 
ture simulations using the scientific communi- 
ty’s “perfect model” assumption; we fixed one 
climate model, then used its predictions as the 
quantity to learn based only on the remaining 
19 climate models’ predictions (and repeated 
this process 10 times). 
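
The “perfect model” evaluation loop can be sketched as follows. This is a minimal illustration, assuming predictions are stored as an array with one column per climate model; the function names, the `combine(others, truth)` interface, and the random stand-in data are hypothetical. The truth series is passed to the combiner only so that an online learner could consume it one step at a time; the ensemble-mean baseline shown here ignores it.

```python
import numpy as np

def perfect_model_errors(model_preds, combine, held_out=None):
    """Treat each held-out model as the "truth" and score a combiner
    that only sees the remaining models.

    model_preds: array of shape (n_years, n_models)
    combine:     function (others, truth) -> predictions of shape (n_years,)
    held_out:    indices of models to use as truth (default: all of them)
    """
    n_models = model_preds.shape[1]
    if held_out is None:
        held_out = range(n_models)
    errors = []
    for k in held_out:
        truth = model_preds[:, k]                    # simulated observations
        others = np.delete(model_preds, k, axis=1)   # remaining models
        pred = combine(others, truth)
        errors.append(float(np.mean((pred - truth) ** 2)))
    return errors

def ensemble_mean(others, truth):
    """Baseline combiner: average the remaining models (truth is unused)."""
    return others.mean(axis=1)

# Stand-in data: 199 "years" of predictions from 20 hypothetical models.
rng = np.random.default_rng(0)
fake_preds = np.cumsum(rng.normal(0.01, 0.1, size=(199, 20)), axis=0)
errors = perfect_model_errors(fake_preds, ensemble_mean, held_out=range(10))
```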

We also ran experiments at higher spatial
and temporal granularity. We used hindcasts
of the IPCC global climate models and
the analogous true observations, over specific
geographical regions corresponding to
several continents, at monthly and annual
time scales. The predicted quantity was still
a temperature anomaly. However, the data
was averaged over a smaller geographical region
than the whole globe; in particular, we
ran experiments for latitude-longitude boxes
corresponding to Africa, Europe, and North
America. In addition to annual experiments,
we also ran experiments using monthly averages
in each of the regions. NASA provided
observed data (http://data.giss.nasa.gov/gistemp)
and the Koninklijk Nederlands Meteorologisch
Instituut (KNMI) Climate Explorer
(http://climexp.knmi.nl) provided the model-prediction
data. Both model and observation
data spanned from January 1900 through
October 2010 (1,330 months). We also used
monthly regional model predictions through
the year 2098 to run six future simulations on
2,376 months (starting in 1900).

Figure 2. Global mean temperature anomalies. (a) Climate model predictions through 2098, with observations
through 2008. The black vertical line separates past (hindcasts) from future predictions. (b) Here, we zoom in
on observations and model predictions through 2008. The legends refer to both figures.

In every experiment we ran, Learn-α had a
lower mean annual prediction error than the
current default practice in climate science,
which is to average over all the climate model
predictions. Furthermore, Learn-α surpassed
the best climate model’s performance in all
but two experiments (historical global annual
and historical monthly Africa). Even then,
Learn-α nearly matched the performance of
the best climate model. Similarly, Learn-α
surpassed (batch) least-squares linear regression
in all but two experiments (a global annual
future simulation and a monthly future
simulation for North America) and, again, its
performance was still close. Learn-α’s outperformance
of batch linear regression on almost
all experiments suggests that the data’s nonstationary
nature, coupled with the limited
amount of historical data, poses challenges to
a naive batch algorithm. Further experiments
with a variety of different batch-learning algorithms
would test this hypothesis (we recently
achieved encouraging results using sparse
matrix completion, an unsupervised batch
technique). 13
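
The following Python sketch illustrates, on deliberately artificial data, why nonstationarity can hurt a batch method. It fits ordinary least squares once on a training period and then predicts a later period in which a different model has become the best one. The setup (two hypothetical models, an abrupt switch, noise-free observations) is an assumption made for clarity and does not reproduce the experiments reported above.

```python
import numpy as np

def batch_least_squares(train_preds, train_obs, test_preds):
    """Fit linear weights plus an intercept once, then predict."""
    X = np.column_stack([train_preds, np.ones(len(train_preds))])
    w, *_ = np.linalg.lstsq(X, train_obs, rcond=None)
    X_test = np.column_stack([test_preds, np.ones(len(test_preds))])
    return X_test @ w

# Two hypothetical models; observations follow model 0, then model 1.
rng = np.random.default_rng(0)
model_preds = rng.normal(size=(200, 2))
train_preds, test_preds = model_preds[:100], model_preds[100:]
train_obs = train_preds[:, 0]
test_obs = test_preds[:, 1]

regression_pred = batch_least_squares(train_preds, train_obs, test_preds)
mean_pred = test_preds.mean(axis=1)

regression_mse = float(np.mean((regression_pred - test_obs) ** 2))
mean_mse = float(np.mean((mean_pred - test_obs) ** 2))
```

The regression learns to trust model 0 completely during training, so after the switch it does worse than the unweighted ensemble mean. An online method that keeps updating its weights would not be locked into the training-period relationship.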

The plots in Figure 3 show squared error
between predictions and (simulated) observations,
from 1900-2098, on a future simulation
using global annual temperature anomalies.
We plot Learn-α’s learning curve against the
best and worst climate models’ performance
(from the remaining ensemble, computed in
hindsight), and the performance of the average
prediction over the ensemble of remaining
climate models. Learn-α successfully predicts
one climate model’s predictions up to the year
2098, which is notable, because future predictions
vary widely among the climate models.
We ran 10 future simulations with global annual
temperature anomalies, each with a different
climate model providing the simulated
observations. In each simulation, Learn-α suffers
less prediction error than the mean over
the remaining models’ predictions, on 75-90
percent of the years. Figure 2 shows a marked
fan-out among the model predictions that increases
into the future. Over time, the model
predictions’ mean performance diverges from
most individual model trajectories. In the historical
global annual experiment, Learn-α
suffers less prediction error than the model
predictions’ mean for more than 75 percent of
the years.

Geospatially Tracking Climate Models 

Previous work provided techniques to combine 
the predictions of the multi-model ensemble, 
at various geographic scales, by considering 
each geospatial region as an independent prob- 
lem. 1,7 However, climate patterns across the 
globe often vary significantly and concurrently, 
so assuming that each geospatial region is
independent could limit the performance
of these previous approaches. We therefore 
extended our work on the TCM algorithm 
as follows: 1 

• We used a richer modeling framework that ac- 
counts for GCM predictions at higher geospa- 
tial resolutions. 

• Using a nonhomogeneous HMM, we modeled 
the neighborhood influence among geospatial 
regions. 

• We ran experiments to validate these exten- 
sions’ effectiveness. 

We proposed a new algorithm: Neighbor- 
hood-augmented Tracking Climate Models 
(NTCM). 3 This algorithm extends the TCM 
algorithm to operate in a setting where the 
GCM’s predictions are assessed at a higher spa- 
tial resolution. NTCM takes into account re- 
gional neighborhood influences when it forms 
predictions. The NTCM algorithm is fully de- 
scribed elsewhere, 3 and differs from TCM in 
two main ways: 

• We modified the Learn-α algorithm to include
influence from a geospatial region’s neighbors 
in how the algorithm updates the weights over 
experts (the multi-model ensemble of GCMs’ 
predictions in that geospatial region). 

• Our master algorithm runs multiple instances 
of this modified Learn-α algorithm simultaneously,
each on a different geospatial region,
and uses their predictions to make a combined 
global prediction. 

We modified the time-homogeneous HMM 
that generates the TCM algorithm 1,10 in or- 
der to model the neighborhood influence. 3 
We instead use dynamically updated (nonho- 
mogeneous) transition dynamics (the proba- 
bilistic model of how the best climate model’s 
identity changes over time). These dynamics 
depend on a geospatial neighborhood scheme: 
the set of nearby regions that influence the re- 
gion in question. Researchers can use a variety 
of shapes and sizes to define the neighbor- 
hood scheme. 
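
As one concrete example of a neighborhood scheme, the Python sketch below lists the cells directly north, south, east, and west of a cell on a regular latitude-longitude grid. Several details are assumptions made for illustration: row 0 is taken to be the northernmost band, cells at the poleward edges simply have fewer neighbors, and east-west neighbors wrap around the globe by default. The function name and the 4 x 8 grid (which corresponds to 45-degree cells) are hypothetical.

```python
def four_neighbors(row, col, n_rows, n_cols, wrap_longitude=True):
    """Return the (row, col) cells north, south, east, and west of a cell."""
    neighbors = []
    if row > 0:
        neighbors.append((row - 1, col))        # north
    if row < n_rows - 1:
        neighbors.append((row + 1, col))        # south
    for step in (1, -1):                        # east, then west
        c = col + step
        if wrap_longitude:
            neighbors.append((row, c % n_cols)) # wrap across the date line
        elif 0 <= c < n_cols:
            neighbors.append((row, c))
    return neighbors

# Neighborhoods for every cell of an illustrative 4 x 8 grid.
grid = {(r, c): four_neighbors(r, c, 4, 8)
        for r in range(4) for c in range(8)}
```

Larger or differently shaped neighborhoods could be expressed the same way, by returning a different set of cells for each region.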

We first used a simple neighborhood scheme 
in which the four immediately adjacent re- 
gions (north, south, east, and west) are the 
geographical region’s possible neighbors. 3 We 
ran experiments with our algorithm on his- 
torical data, using temperature observations 
and GCM hindcasts. We obtained historical
climate model data from the CMIP3 archive
(www-pcmdi.llnl.gov/ipcc/about_ipcc.php) using
the Climate of the 20th Century Experiment
(20C3M). Figure 4 compares the new
algorithm’s performance over time: NTCM
(indicated in red and blue) using 45-degree
square regions, versus global Learn-α (as in
the original TCM algorithm, 1 indicated in
black) in a graph illustrating cumulative annual
prediction error. This graph indicates
that, for most years, and in particular for years
later in the time-sequence, the NTCM algorithm’s
cumulative global prediction error was
less than that of the global Learn-α algorithm
used in TCM, with NTCM’s β = 1 variant (full
neighborhood influence) obtaining lower prediction
error than that of the β = 0 variant (no
neighborhood influence).

Figure 3. (a) The squared error between predictions and (simulated) observations, from 1900-2098,
on a future simulation using global annual temperature anomalies. This shows the algorithm tracking
the predictions of one climate model using the predictions of the remaining 19 as input, with no true
temperature observations. The simulated observations are from the best-performing climate model from
the whole ensemble (previously computed on historical data). The black vertical line separates past from
future. (b) Here, the graph zooms in on the y-axis.

Challenge Problems for 
Climate Informatics 

Climate scientists are working on many dif- 
ferent kinds of problems for which machine 
learning, and other computer science expertise, 
could potentially have a big impact. Here, we 
provide a brief description of a few examples 
(with a discussion of related work in the literature)
that typify these ideas, although any
specific implementation mentioned shouldn’t
be considered the last word.

Figure 4. Comparison of the Neighborhood-augmented Tracking
Climate Models (NTCM) algorithm's performance over time.
The cumulative annual prediction error of NTCM is shown, using
45-degree square cells, compared to Learn-α's prediction error.

Improving Multi-Model Ensemble Predictions 

As previously discussed, researchers have de- 
veloped and are improving multiple climate 
models; currently there are about 25 centers 
across the globe, many with multiple mod- 
eling groups. Each model shares some basic 
features with some of the other models, but 
generally researchers design and indepen- 
dently implement unique models. In coor- 
dinated “Model Intercomparison Projects” 
(MIPs) — most usefully, the Coupled MIP 
(CMIP3, CMIP5), the Atmospheric Chemis- 
try and Climate MIP (ACCMIP), and the Pa- 
leoClimate MIP (PMIP3) — modeling groups 
attempt to perform analogous simulations 
with similar boundary conditions, but with 
multiple models. These multi-model ensem- 
bles offer the possibility to assess what features 
are robust across models. They also facilitate 
the study of the roles of internal variability, 
structural uncertainty, and scenario uncer- 
tainty in assessing predictions at different 
time and space scales. Finally, MIPs provide 
multiple opportunities for model-observation 
comparisons. Questions of interest include 
the following: 

• Are there “skill” metrics for present or past 
model simulations that are useful for future 
predictions? 


• Are there weighting strategies that maximize 
predictive skill? How would researchers explore 
this? 

Weather and seasonal forecasts also raise these 
questions, but because of the long time scales 
involved in climate prediction, they’re more 
difficult for climate researchers to address. 8,14 
Our research provides an example of how re- 
searchers can use the MME to predict climate 
change. 1,3,9,13 

Parameterization Development 

Climate models need to be able to model the 
relevant physics at all scales, even those finer 
than any finite model can currently resolve. Ex- 
amples include cloud formation, turbulence in 
the ocean, land surface heterogeneity, ice floe 
interactions, and chemistry on dust particle 
surfaces, to name a few. Typically, scientists, 
using physical intuition and limited calibration 
data, dealt with these phenomena by using pa- 
rameterizations (physically coherent approxi- 
mations of the bulk effects) that attempted to 
capture the phenomenology of a specific pro- 
cess, and its sensitivity in terms of the (resolved) 
large scales. As observational data become more 
available, and direct numerical simulations of 
key processes become more tractable, the po- 
tential for machine learning and data mining 
techniques to help define and automate new pa- 
rameterizations and frameworks is increasing. 
For example, some researchers have used neural 
network frameworks to develop atmospheric 
radiation models for use in GCMs. 15 

Paleo Reconstructions 

Understanding how climate varied in the past, 
before the onset of widespread instrumentation, 
is of great interest to climate scientists. The cli- 
mate changes seen in the Paleo record dwarf 
those in the 20th century, and hence could pro- 
vide insight into the significant changes we ex- 
pect this century. However, Paleo data are even 
sparser than instrumental data, and they aren’t 
usually directly commensurate with the in- 
strumental record. Paleo records (such as water 
isotopes, tree rings, pollen counts, and so on) 
could indicate climate change by proxy, but of- 
ten nonclimatic influences affect their behavior, 
and sometimes their relationships to more stan- 
dard variables (such as temperature or precipita- 
tion) are nonstationary or convolved. Scientists 
face an enormous challenge in bringing togeth- 
er disparate, multiproxy evidence to discover 
large-scale patterns of climate change or, in
contrast, in building models with enough “for- 
ward modeling” capability that they can use the 
proxies directly as modeling targets. 16 

Data Assimilation and Initialized 
Decadal Predictions 

The main way in which sparse observational 
data are used to construct complete fields is
through data assimilation. This field of re- 
search concerns how to update physics-based 
models with observed data, and includes such 
technology as the ensemble Kalman filter. 17 
Data assimilation is a staple of weather fore- 
casts, and the various re-analyses in the at- 
mosphere and ocean. In many ways, this is 
the most sophisticated use of the combina- 
tion of models and observations, but its use 
in improving climate predictions is still in its 
infancy. For weather time scales, this works 
well. For longer-term forecasts (seasons to de- 
cades) the key variables are in the ocean, not 
the atmosphere; climate scientists have yet to 
fully develop a climate model initialization in
which the evolution of ocean variability mod- 
els the real world in useful ways. 18,19 Climate 
scientists find the early results intriguing, if 
not convincing, and many more examples 
are slated to come online in the new CMIP5 
archive. 20 
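
For readers unfamiliar with the ensemble Kalman filter, the Python sketch below shows a single analysis (update) step in its widely used perturbed-observation form. It is a textbook-style illustration under simplifying assumptions (a linear observation operator, Gaussian observation errors, a tiny two-variable state), not the configuration of any operational system or of the work cited above. The function name and the numbers are hypothetical.

```python
import numpy as np

def enkf_analysis(ensemble, obs, H, R, rng):
    """Perturbed-observation ensemble Kalman filter analysis step.

    ensemble: (n_members, n_state) forecast states
    obs:      (n_obs,) observation vector
    H:        (n_obs, n_state) linear observation operator
    R:        (n_obs, n_obs) observation-error covariance
    """
    n_members = ensemble.shape[0]
    deviations = ensemble - ensemble.mean(axis=0)
    P = deviations.T @ deviations / (n_members - 1)   # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)      # Kalman gain
    noise = rng.multivariate_normal(np.zeros(len(obs)), R, size=n_members)
    innovations = (obs + noise) - ensemble @ H.T      # one per ensemble member
    return ensemble + innovations @ K.T

rng = np.random.default_rng(0)
forecast = rng.normal(0.0, 1.0, size=(500, 2))   # 500 members, 2 state variables
H = np.array([[1.0, 0.0]])                        # only the first variable is observed
R = np.array([[0.01]])                            # small observation error
analysis = enkf_analysis(forecast, np.array([2.0]), H, R, rng)
```

Because the observation is much more precise than the forecast spread, the updated ensemble is pulled close to the observed value for the first variable and its spread shrinks. The unobserved variable changes only through its sample covariance with the observed one.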

Advances in data assimilation could also ben- 
efit other areas of computer science. In robotics, 
for example, when a robot uses a physics-based 
model for the dynamics governing its move- 
ment, often it must also incorporate informa- 
tion gleaned from the onboard sensors, and, in 
some cases, additional real-time instructions 
from a human controller. 

Developing and Understanding Perturbed 
Physics Ensembles (PPE) 

The spread among different model predictions 
from different modeling groups is one way to 
measure the models’ “structural uncertainty.” 
However, we can’t consider these models a 
controlled random sample from the space of 
all plausible models. An approach that leads to 
a more accurately characterized ensemble is 
to take a single model, and vary multiple (un- 
certain) parameters within the code, generat- 
ing a family of similar models that nonetheless 
sample a good deal of the intrinsic uncertainty 
that arises when we choose any specific set of 
parameter values. Researchers have successfully 
used these Perturbed Physics Ensembles (PPEs)
in the Climateprediction.net and Quantifying
Uncertainty in Model Predictions (QUMP) 
projects to generate controlled model ensem- 
bles that they can systematically compare to 
observed data, and then make inferences. 21,22 
Designing such experiments, and efficiently
analyzing what are sometimes thousands of
simulations, remains a challenge, though one
that researchers will increasingly attempt.
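
The perturbed-physics idea can be sketched with a toy example in Python. Everything here is a stand-in chosen for illustration: `toy_model` is a hypothetical lagged linear response to forcing rather than a climate model, the two parameters and their ranges are invented, and uniform random sampling is only one of many possible experimental designs.

```python
import numpy as np

def toy_model(sensitivity, ocean_lag, forcing):
    """Hypothetical stand-in for a climate model: lagged response to forcing."""
    temp = np.zeros(len(forcing))
    for t in range(1, len(forcing)):
        target = sensitivity * forcing[t]
        temp[t] = temp[t - 1] + (target - temp[t - 1]) / ocean_lag
    return temp

def perturbed_physics_ensemble(forcing, n_members, rng):
    """Run the same model many times with perturbed parameter values."""
    params = np.column_stack([
        rng.uniform(0.2, 1.2, n_members),     # sensitivity (assumed range)
        rng.uniform(1.0, 20.0, n_members),    # ocean lag in years (assumed range)
    ])
    runs = np.array([toy_model(s, lag, forcing) for s, lag in params])
    return params, runs

rng = np.random.default_rng(0)
forcing = np.linspace(0.0, 3.0, 100)
observed = toy_model(0.8, 5.0, forcing)       # synthetic "observations"
params, runs = perturbed_physics_ensemble(forcing, 200, rng)

# Compare every ensemble member with the observations.
rmse = np.sqrt(np.mean((runs - observed) ** 2, axis=1))
best = params[np.argmin(rmse)]                # member closest to the observations
```

Because every member shares the same code and differs only in known parameter values, the per-member errors can be related directly to the parameters, which is what makes such an ensemble easier to interpret than a collection of independently built models.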

We hope that this article encourages
future work, not only on
some of the challenge problems 
proposed here, but also on new 
problems in climate informatics. A huge and 
varied amount of climate data is available, pro- 
viding a rich and fertile playground for future 
machine learning and data mining research. 
Even exploratory data analysis could prove use- 
ful for accelerating discovery. There are count- 
less collaborations possible at this intersection 
of climate science and machine learning, data 
mining, and statistics. We strongly encourage 
future progress on a range of emerging problems
in climate informatics.